{smcl}
{* 12aug2020}{...}
{cmd:help mata mm_linbin()}
{hline}

{title:Title}

{p 4 4 2}
{bf:mm_linbin() -- Linear and exact binning}

{title:Syntax}

{p 8 23 2}
{it:real colvector}
[{cmd:_}]{cmd:mm_linbin(}{it:x}{cmd:,} {it:w}{cmd:,} {it:g}{cmd:)}

{p 8 23 2}
{it:real colvector}
{cmd:mm_fastlinbin(}{it:x}{cmd:,} {it:w}{cmd:,} {it:g}{cmd:)}

{p 8 23 2}
{it:real colvector}
[{cmd:_}]{cmd:mm_exactbin(}{it:x}{cmd:,} {it:w}{cmd:,} {it:g}
[{cmd:,} {it:dir}{cmd:,} {it:include}]{cmd:)}

{p 8 23 2}
{it:real colvector}
{cmd:mm_fastexactbin(}{it:x}{cmd:,} {it:w}{cmd:,} {it:g}
[{cmd:,} {it:dir}{cmd:,} {it:include}]{cmd:)}

{p 8 23 2}
{it:real colvector}
{cmd:mm_makegrid(}{it:x} [{cmd:,} {it:m}{cmd:,}
{it:e}{cmd:,} {it:min}{cmd:,} {it:max}]{cmd:)}


{p 4 4 2}
where

{p 12 16 2}
{it:x}:  {it:real colvector} containing data points

{p 12 16 2}
{it:w}:  {it:real colvector} containing weights

{p 12 16 2}
{it:g}:  {it:real colvector} containing (sorted) grid points

{p 10 16 2}
{it:dir}:  {it:real scalar} idicating the direction of the
intervals (default: right open)

{p 6 16 2}
{it:include}:  {it:real scalar} idicating that data outside the grid be
included in the first and last bins

{p 12 16 2}
{it:m}:  {it:real scalar} specifying the number of equally spaced grid
points (default is 512)

{p 12 16 2}
{it:e}:  {it:real scalar e} extending the grid range

{p 10 16 2}
{it:min}:  {it:real scalar} specifying the minimum grid value
(default: {cmd:min(x)} - {it:e})

{p 10 16 2}
{it:max}:  {it:real scalar} specifying the maximum grid value
(default: {cmd:max(x)} + {it:e})


{title:Description}

{p 4 4 2}
{cmd:mm_linbin()} returns linearly binned counts of
{it:x} at the grid points {it:g} ({it:g} must be sorted). {cmd:_mm_linbin()} does
the same but assumes {it:x} to be sorted.

{p 4 4 2}
{cmd:mm_fastlinbin()} returns linearly binned counts of
{it:x} at the grid points {it:g}, where {it:g} is assumed to be a (sorted) regular grid
of equidistant points. {cmd:mm_fastlinbin()} does not need to sort the data and is thus
faster than {cmd:mm_linbin()}, at least in large datasets. If the data has already been
sorted, however, {cmd:_mm_linbin()} can be used, which will be faster than
{cmd:mm_fastlinbin()}.

{p 4 4 2} {cmd:mm_exactbin()} returns counts of {it:x} within
the intervals defined by the grid points {it:g} ({it:g}
must be sorted). {cmd:_mm_exactbin()} does
the same but assumes {it:x} to be sorted.

{p 4 4 2}
{cmd:mm_fastexactbin()} returns counts of {it:x} within
the intervals defined by the grid points {it:g}, where {it:g} is assumed to be a (sorted) regular grid
of equidistant points. {cmd:mm_fastexactbin()} does not need to sort the data and is thus
faster than {cmd:mm_exactbin()}, at least in large datasets. If the data has already been
sorted, however, {cmd:_mm_exactbin()} can be used, which will be faster than
{cmd:mm_fastexactbin()}.

{p 4 4 2}
The default for {cmd:mm_exactbin()} and {cmd:mm_fastexactbin()} is to use right open intervals
(with the last interval closed). Specify {it:dir}!=0
to use left open intervals (with the first interval
closed). {cmd:mm_exactbin()} and {cmd:mm_fastexactbin()} do not allow {it:x} to contain data
outside the grid range, unless {it:include}!=0 is specified, in which case
such data is included in the first and last bin, respectively.

{p 4 4 2}Argument {it:w} in the above functions specifies weights associated
with the observations in {it:x}. Specify {it:w} as 1 to obtain unweighted
results. The sum of returned counts is equal to the sum of weights.

{p 4 4 2}
{cmd:mm_makegrid()} returns a grid of {it:m}
equally spaced points over {it:x}. The default range of the grid is
[{cmd:min(}{it:x}{cmd:)},{cmd:max(}{it:x}{cmd:)}].
If {it:e} is specified, the range is set to
[{cmd:min(}{it:x}{cmd:)}-{it:e},{cmd:max(}{it:x}{cmd:)}+{it:e}].
Alternatively, specify {it:min} and/or {it:max} to
determine the limits of the grid range ({it:e} will be ignored in this case).


{title:Remarks}

{p 4 4 2}Linear binning: Let g(j) and g(j+1) be the two nearest grid points
below and above observation x.
Then w*(g(j+1)-x)/(g(j+1)-g(j)) is added to the count at
g(j) and w*(x-g(j))/(g(j+1)-g(j)) is added to the count at
g(j+1), where w is the weight associated with x. Data below (above) the grid range
is added to the count of the first (last) grid point.


{title:Conformability}

    [{cmd:_}]{cmd:mm_linbin(}{it:x}{cmd:,} {it:w}{cmd:,} {it:g}{cmd:)}
    {cmd:mm_fastlinbin(}{it:x}{cmd:,} {it:w}{cmd:,} {it:g}{cmd:)}
             {it:x}:  {it:n x} 1
             {it:w}:  {it:n x} 1 or 1 {it:x} 1
             {it:g}:  {it:m x} 1, {it:m}>=1
        {it:result}:  {it:m x} 1.

    [{cmd:_}]{cmd:mm_exactbin(}{it:x}{cmd:,} {it:w}{cmd:,} {it:g}{cmd:,} {it:dir}{cmd:,} {it:include}{cmd:)}
    {cmd:mm_fastexactbin(}{it:x}{cmd:,} {it:w}{cmd:,} {it:g}{cmd:,} {it:dir}{cmd:,} {it:include}{cmd:)}
             {it:x}:  {it:n x} 1
             {it:w}:  {it:n x} 1 or 1 {it:x} 1
             {it:g}:  {it:m x} 1, {it:m}>=2
           {it:dir}:  1 {it:x} 1
       {it:include}:  1 {it:x} 1
        {it:result}:  {it:m}-1 {it:x} 1.

    {cmd:mm_makegrid(}{it:x}{cmd:,} {it:m}{cmd:,} {it:e}{cmd:,} {it:min}{cmd:,} {it:max}{cmd:)}:
             {it:x}:  {it:n x} 1
             {it:m}:  1 {it:x} 1
             {it:e}:  1 {it:x} 1
           {it:min}:  1 {it:x} 1
           {it:max}:  1 {it:x} 1
        {it:result}:  {it:m x} 1.


{title:Diagnostics}

{p 4 4 2}[{cmd:_}]{cmd:mm_exactbin()} and {cmd:mm_fastexactbin()} abort with error if
{cmd:min(}{it:x}{cmd:)} < {it:g}{cmd:[1]} or
{cmd:max(}{it:x}{cmd:)} > {it:g}{cmd:[rows(}{it:g}{cmd:)]}
(unless {it:include}!=0 is specified).

{p 4 4 2}[{cmd:_}]{cmd:mm_linbin()}, {cmd:mm_fastlinbin()}, [{cmd:_}]{cmd:mm_exactbin()}, and {cmd:mm_fastexactbin()}
produce erroneous results if {it:g} is not sorted or if {it:x},
{it:w}, or {it:g} contain missing values.

{p 4 4 2}{cmd:mm_fastlinbin()} and {cmd:mm_fastexactbin()} produce erroneous results if
the values in {it:g} are not equidistant.

{p 4 4 2}{cmd:_mm_linbin()} and {cmd:_mm_exactbin()}
produce erroneous results if {it:x} is not sorted.


{title:Source code}

{p 4 4 2}
{help moremata_source##mm_linbin:mm_linbin.mata},
{help moremata_source##mm_exactbin:mm_exactbin.mata},
{help moremata_source##mm_makegrid:mm_makegrid.mata}


{title:Author}

{p 4 4 2} Ben Jann, University of Bern, ben.jann@soz.unibe.ch


{title:Also see}

{p 4 13 2}
Online:  help for
{bf:{help mf_range:[M-5] range()}},
{bf:{help m4_utility:[M-4] utility}},
{bf:{help moremata}}
{p_end}
